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Abstract 

In this paper, we use a finite-state continuous-time Markov chain 
with one absorbing state to model an individual’s lifetime. Under this 
model, the time of death follows a phase-type distribution, and the 
transient states of the Markov chain are known as phases. We then 
attempt to provide an answer to the simple question “ What is the con¬ 
ditional age distribution of the individual, given its current phase ”? We 
show that the answer depends on how we interpret the question, and 
in particular, on the phase observation scheme under consideration. 
We then apply our results to the computation of the age pyramid for 
the endangered Chatham Island black robin Petroica traversi during 
years of intensive conservation efforts in 1980-1989. 

Keywords: Phase-type distribution; Transient Markov chain; Age dis¬ 
tribution; Petroica traversi 


1 Introduction 

A random variable has a phase-type (PH) distribution if it corresponds to the 
time until absorption of a transient Markov chain with one absorbing state. 
PH distributions, introduced in the early 1980’s by Neuts [TO, Chapter 2], 
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form a class of distributions with considerable modelling versatility, which 
results from attractive probabilistic properties. The set of PH distributions 
is closed under convolutions and finite mixtures, and is dense in the class 
of all distributions with non-negative support. PH distributions have there¬ 
fore been extensively used in practice, in particular for modelling lifetime 
distributions, see for instance H]> 0, and [7]. 

The question addressed in this paper arose initially when modelling the 
lifetime reproductive success of the black robin Petroica traversi , which is 
an endangered songbird species endemic to the Chatham Islands, an isolated 
archipelago located 800km east of New Zealand. By 1980, the population of 
black robins had declined to five birds, including a single successful breed¬ 
ing pair, on Mangere Island |2]. Through intensive conservation efforts in 
1980-1989 by the New Zealand Wildlife Service (now the Department of Con¬ 
servation), the population recovered to 93 birds by spring 1990 |S|. Over the 
next decade (1990-1998), the population was closely monitored, but without 
human intervention. Nevertheless the population continued to grow rapidly 
to 197 adults by 1998, but after this period, the population growth slowed 
considerably and it only reached 239 adults in 2011 |9|. 

The black robin population is modeled in a parallel studjQ using a special 
class of branching processes called the Markovian binary tree (MBT), in 
which an underlying transient Markov chain controls the reproduction and 
death events of each individual in the population. A direct consequence 
of this model is that each bird lives for a random time which has a PH 
distribution, in which it progresses through states (also called phases) of a 
continuous-time Markov chain and dies when the chain moves to an absorbing 
state. In this application, the phases do not have any particular physical 
interpretation, their role is to increase the accuracy and realism of the MBT 
model, as opposed to the simplest linear birth-and-death model. In other 
real-world applications, the phases may have a physical meaning, such as in 
|Tj where they model physiological ages, which can be interpreted as relative 
health indices, as opposed to chronological age. 

MBTs have proved to be powerful stochastic models in population biology 
and demography f5j. Having fitted an MBT to real data, we can calculate 
properties of the population, such as the probability that it will become 

1 S. Hautphenne, M. Massaro, E. S. Kennedy, and R. Sainudiin. Modelling of the 
Chatham Island black robin Petroica traversi populations using branching processes: In¬ 
formed management strategies for reintroduction of endangered species. In preparation 
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extinct in some time interval [0, t], and the distribution of the population size 
at time t. In particular, the model allows us to compute the expected number 
of birds in phase j at time t, and the asymptotic frequency of phase j in the 
population. However, the latter two quantities may not have significance 
for biologists, who are likely to be interested in age-specific, rather than 
phase-specific, properties of individuals. We therefore need to be able to 
translate information about the phase distribution into information about 
the age distribution; that is, we need to answer the question “ what is the age 
distribution of a bird, given its phase ”? 

Note that the reverse exercise of translating information about age into 
information about phase is much easier since the distribution of the phase 
at any given age is well known. The main difference is that age is determin¬ 
istic, while phase is random and an individual stays in a given phase for an 
exponentially distributed amount of time. 

In fact, defining the event that an individual is in phase j is already 
not trivial. It is necessary to describe in more detail how the individual is 
sampled. We suggest three sampling schemes according to which an observer 
looks at the phase of an individual: 

• first, we assume that individuals are born according to a Poisson process 
which started infinitely far in the past, and one observes the phase of 
a randomly selected individual at time 0; 

• second, no assumption is made on the birth process, and the observa¬ 
tions of individuals occur according to a Poisson process, in which case 
we allow a single or multiple observation(s); 

• third, a single observation occurs at a uniformly distributed random 
time within some time window. 

For each observation scheme, we compute the conditional age distribution 
given the observed phase, as well as related quantities. We show that the age 
distribution conditional on a single rare Poisson phase observation coincides 
with the age distribution conditional on a single rare uniform phase observa¬ 
tion. Moreover, this age-distribution also corresponds to the age distribution 
of a randomly selected bird in a given phase at time 0 in the process where 
individuals are born according to a Poisson process. 

The questions addressed in this paper, and their proposed answers, are 
not restricted to the context of PH distributions and ageing processes; they 
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have a wider interest in Markov chain theory. Indeed, if Q is the generator 
of a continuous-time Markov chain {X(f) : t > 0}, then it is well known that 
P[X(t) = j|X(0) = i] = [exp(Qf)]p, but the conditional distribution of the 
time t elapsed since the start of the Markov chain, given that the chain is 
observed in phase j is much less explored. As we observed above, the nature 
of this observation event needs to be described carefully. 

The paper is organised as follows. In the next section, we provide some 
background on PH distributions used to model the ageing process of individ¬ 
uals. In Section 3, we compute the conditional age distribution at time 0 in 
the Poisson birth process, given the observed phase. In section 4, we consider 
the Poisson phase observation scheme and provide the conditional age distri¬ 
bution with a single or multiple observation(s). In Section 5, we consider a 
uniform observation scheme, and in Section 6 we discuss the rare observation 
limit of the results obtained in Sections 4 and 5. Finally, in Section 7, we 
illustrate our results on a toy example first, and then on the computation of 
the age pyramid for the black robin population. 

2 The phase-type lifetime distribution 

We assume that the lifetime of an individual is a random variable L which fol¬ 
lows a phase-type PH(a, Q) distribution with m transient phases {1, 2,..., m } 
and the absorbing phase 0. This PH distribution is parameterised by an 
1 x m vector a which gives us the initial distribution of the underlying 
Markov chain, and an m x m matrix Q containing the transition rates be¬ 
tween the transient phases. So the assumption is that the lifetime of an 
individual progresses through phases (which may or may not correspond to 
some physically-observable characteristics) according to a realisation of the 
Markov chain, and the individual dies when the chain moves to the absorbing 
phase 0. 

The PH(a, Q) distributed random variable L has a density and a distri¬ 
bution function respectively given by 


ae Qx q 0 
1 - ae Qx l, 


h(x) 

F L (x) = P[L < x] 


( 1 ) 


where q 0 = (—Q)1 is the absorption rate vector, and 1 is a column vector of 
ones. Let ip(x) denote the phase of the individual at age x, and let j be any 
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transient phase. Another basic result on PH random variables tells us that 
the probability that an individual is in phase j when its age is x is 

P[phase = j | age = x\ = P[y>(x) = j] = (ae Qx )j = ae Qx ej, 

where e 3 is the jth unit (column) vector. Our question is the reverse of 
this: “ If we observe an individual in phase j, what can we say about its age?" 
Bayes’ Theorem gives us 

. P[phase = j | age < x] P[age < x] 

P age < x phase = j =--q-• 

P [phase = j\ 

The problem is that we do not yet have anything in the model to make sense 
of P [age < x\ and P [phase = j }. Our aim is in this paper is to find satisfying 
answers to the above question. 


3 A Poisson birth process 

One assumption that we might make is that individuals have been born at the 
epochs of a Poisson process with parameter /3 over the time interval (—oo, 0), 
and that we observe the phase of a single individual randomly taken from 
the population at time 0. Let Fj(s) be the probability that the age of the 
randomly-selected individual in phase j at time 0 is smaller than s. 


Lemma 3.1 For any s > 0, 




Q;e Q«( —Q) l e . 

a(-Q)- 1 e i 


Proof. For any T > 0, 


( 2 ) 


• the number N T of individuals born in the time interval [— T, 0) has a 
Poisson distribution with parameter /3T, 

• conditional on N T = n, the birthtimes of the n individuals are uniformly 
and independently distributed on the interval [—T, 0), 

• an individual born at time u G [—T, 0) will be alive at time 0 with 

probability by ([[]). 
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So, an individual born in the interval [—T, 0) will be alive at time 0 with 
probability (1 /T) f_ T ae^~ u ^l du. It will be alive and in phase j at time 
0 with probability ( 1 /T) J°_ T ae ci ^~ u ^ej du, and it will be alive, in phase j 
and older than s at time 0 with probability (1/T) f/ T cxe^~ u ^ej du. So the 
probability that an individual in phase j at time 0 is older than s is 

J/ T ae Q (~“)ej du 

J° T ae c h-“)e J du 

The fact that any PH distribution has a finite mean allows us to let T — y oo 
and, changing the variable of integration, we arrive at the conclusion that 
the probability that a randomly-selected individual in phase j at time 0 is 
older than s is 

/'/° cxe^Bj du ae Qs (-Q) _1 ej 

/ 0 °° OLe^ u ej du a.(—Q)~ 1 ej 

which completes the proof. □ 

Note that this model corresponds to an M/PH /oo queue, and (J2J) gives 
the distribution of the age of a randomly selected individual in steady state. 
Also observe that Fj(s) does not depend on the rate f3 of the Poisson birth 
process. 

The above analysis is nice. However the birth process in an MBT is not 
Poisson and, more generally, there is no reason to believe that a Poisson 
process is a good model for births. In the next three sections, we shall 
follow an alternative approach: without making any assumption on the birth 
process, we look at just a single individual and proceed by explicitly putting 
the observation process into the model. 

For further use, we denote the age of the individual at the time of obser¬ 
vation as A 0 and the observed phase as ip a . We are therefore interested in 
computing 

P[Ao<s\(p 0 =j]. (3) 


4 The Poisson observation scheme 

In this section, we assume that, following its birth, the phase of an individual 
is observed according to a Poisson process with rate 7 . We first compute the 
age distribution given the phase at the first observation time. Then we gen¬ 
eralise our results to the age distribution at the time of the last observation, 


6 





given that the observer records the phases at k > 2 successive time events of 
the Poisson process. 

4.1 Single observation 

By the properties of Poisson processes, the rate at which an individual is 
observed when it is in phase j is 7 , for any j. A slight modification of 
the underlying phase process then allows us to compute the conditional age 
distribution of the individual at the first observation time, given that the 
individual is in phase 1 < j < m at that time. It suffices to add m absorbing 
phases 1 ', 2 ,m' (one per transient phase), to the process so that phase j' 
is reached when the individual is observed in phase j. The initial distribution 
ct stays unchanged, but now the transition rate matrix becomes 

t( 7 ) = q- 7 l 

and there are m + 1 absorption rate vectors 

9o>M7)> -■■>**»'(7)» 

where q 0 = — Q1 records the rates of absorption into phase 0 (corresponding 
to the death of the individual), and for V < j' < m', £y( 7 ) = 7 ej records 
the rates of absorption into phase j' (corresponding to the observation of the 
individual in phase j). For the sake of clarity of the presentation, we shall 
drop the dependence on 7 in T( 7 ) and ty { 7 ) and use the simpler notation 
T and ty in the sequel. 

For any phase j (transient or absorbing), let 

B{j) = infjt > 0 : <p(t) = j} 

be first time the individual enters phase j, with B(j) = 00 if the individual 
never enters phase j. Then, with probability one, precisely one of the random 
variables 

{ J B(0),5(T),i?(2 / ),...,5(m')} 

is finite, and the age distribution conditional on the observed phase being j 
can be rewritten as 

P [A 0 <s\Lp 0 = j) = P [B(j') < s | B(J') < 00 ]. (4) 

Based on this observation, the next proposition provides an expression for 
the conditional age distribution. 
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Proposition 4.1 The age distribution of the individual at the first observa¬ 
tion (event) time of a Poisson (7) process, conditional on the observed phase 
being j, is given by 


P\A <s\v — — 1 - a ex P( Tg ) (~ T ) e 3 - 

P[A 0 _s\ip 0 af—T')- 1 e,- 


( 5 ) 


Proof. We have 

?[-£>(/) < s | B(j') < 00] = 1 —P[B(J') > s | B(j') < 00] 

i _ P[s < B(j') < 00] 
P[fi(j') < 00] 

On the one hand, 

/ OO 

a. exp(Tw) tpdu 
= aexp(Ts) (—T) _1 7 ej, 
and on the other hand, since P [B(j') > 0] = 1, 

P W) < 00] = P[0 < B(j') < 00] =a(-T)-V / , 


( 6 ) 


which, with (141) . completes the proof. □ 

The age at the observation time, conditional on the observed phase being 
j, can be written as the sum of two random variables, 


A 0 — Yj + Zj, 

where Yj denotes the last time that the Markov chain entered phase j before 
observation, and Zj denotes the sojourn time in phase j between this time and 
observation, both random variables being conditionally independent given 
( p 0 = j. The random variables A 01 Yj , and Zj are illustrated in Figure [U 
Besides purely theoretical interest, the distribution of Yj and Zj may have 
practical interest when the phases have some physical interpretation (such 
as physiological ages), and an observer who sees an individual in phase j is 
interested in knowing the chronological age at which the individual entered 
that particular phase (Yj), or for how long he/she has been in that phase 
(Zj). The respective distributions of Yj and Zj are computed in the next two 
propositions. 




PP( 7 ) 


▼-▼-▼-> 



Figure 1: A Possible trajectory of the phase process until absorption. The 
observation process is a Poisson process with rate 7 (denoted as PP( 7 )), 
where the ▼ symbols represent the events, and the phase is 7 = 1 at the first 
observation event. 


Proposition 4.2 The conditional distribution of Yj, given ip 0 = j, has a 
point mass at zero given by 


P[Yj = 0 | To = j\ 


a 4 


(a j + 7 M-t)-V 


( 7 ) 


and for y > 0, 


P[Yj <y\Vo = 3} 


ae T «(-T)- 1 (Q +Ajl)ej 
(Xj + 7 )a(-T)- 1 e j 


( 8 ) 


where Xj = —Qjj- 


Proof. First, recall from (J 6 ]) that 

P [To = 3} = P W) < 00 ] = a (—T)- 1 7 ej. 

Let T a be the time at which the individual is observed, the clock being set at 
the individual’s birth time. Thanks to the memoryless property of exponen¬ 
tial random variables, T a has the same distribution as the interarrival time 
in the Poisson observation process, that is, T 0 is exponentially distributed 
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with parameter 7 . Then, for any y > 0, by conditioning on the value of T a , 
we have 


Ppjf e [y,y + dy], v 0 = j] 

noo 

= / P K e [y>y + d y], <p( u ) = j\T 0 e [u,u + du ]] 7 e~ lu du 

Jo 

/ oo 

J2(ae Qy )kQkje-^ u ~ y) dyye^du. 

" k^j 

Next, observe that since Qjj = —A j, we have 

= «e Qs (Q + A, l)e r 

k^j 

As 


e“ A ' (u “ v) 7 e~ lu du = 


ye 


-iy 


Aj + 7 ’ 


and T = Q — 7 I, we have 


P K' e [j/,J/ + dj/], Vo = 3\ = 


cx.e Ty { Q + A j I)ej 7 dy 
Xj + y 


Similarly, 


P[Yj- = 0, </? 0 = j] = / aje- XjU ye- lu du = 


Qb7 
Aj +7’ 


(9) 


which leads to (17|) . Finally, (JSJ) follows from (d) and (J5J) since 

P K' < y | Vo = j] = P K = 0 I v^o = j] + [ P[Yj e [u, u + du], ip 0 = j], 

Jo 

for any y > 0 . □ 


Proposition 4.3 The conditional distribution of Zj, given <p 0 = j, is expo¬ 
nential with parameter 7 + A j. 


Proof. Let Sj denote the sojourn time of the underlying Markov chain in 
phase j, and let denote the event that upon leaving phase j , the chain 

moves to phase j'. We have P [E^y] = y/(Xj + 7 ), and 


therefore 


P[Zj>z\<p 0 = j] = = e-^>‘ 


which proves the statement of the proposition. 


□ 
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4.2 Multiple observations 

We now assume that the observer makes k > 2 observations of an individual 
according to a Poisson process with rate 7 . We further assume that the 
individual is still living at the time of the last (fcth) observation. We shall 
compute the age distribution at the time of the kth observation, given the 
sequence of phases observed at the observation times. 

For that purpose, we consider the same process as in the single obser¬ 
vation case, with absorbing phases 0,1',..., m!. Every observation event 
corresponds to a phase absorption in one of the phases 1 ',..., m! . After 
a phase absorption in j' (which corresponds to an observation of phase j), 
the process instantaneously starts again in phase j, that is, with the initial 
distribution vector ej, until the next absorption event. 

In order to properly define the quantities of interest, we need to redefine 
the random variables B(j ) as follows: for any initial phase distribution 0 and 
for any phase j, 

B e (j) = inf{t > 0 : ip(t) = j , yj(0) ~ 6} (10) 

is the first time the process reaches phase j, given that the initial phase fol¬ 
lows the distribution 6. For the sake of clarity, we shall write Bfij) instead of 
B e j(j) when the process starts in phase £ with probability one. Let j 1 ,... ,jk 
be the k successive observed phases. The age of the individual at the last ob¬ 
servation time conditional on the observed phases, denoted by A 0 (j 1 ,..., A) 

(or by A 0 when there is no confusion), is then given by 

k 

Mh, ■■■,jk) = B a (j[) + 

i =2 

The age distribution at the time of the last observation conditional on the 
sequence of observed phases can then be written as 

P [A 0 < s | B a (j[) < 00, B h (j' 2 ) < 00,... , B jk l (j' k ) < 00]. 

In order to compute this distribution, we need the following lemma, which is 
a particular case of Theorem 1 in [3]: 

Lemma 4.4 For 2 < % < k, define the m x m matrix 

/ An«l A „A22«2 A a ^ e A -ii{ s - u l----~ u i-l)rill 

I e A12 e /A.23 • • • e au i 

Ui- 1=0 


B u (s) = 


^S—U 1 


1 m=o J 112=0 


11 


where s > 0 and A I? are constant m x m m.atrices. If the km x km block- 
structured matrix A ^ is defined by 


A ( fc ) 


A^i A^2 0 0 ... 0 

0 A22 A23 0 ... 0 

0 ... 0 0 A(fc_i)(fc_i) A(fc_i)fc 

0 ... 0 0 0 Afcfc 


then 

Bu M = (fh ® I m )e AW '(A,, ® I„), 
where f ki is a k x 1 unit vector such that (f ki )j = S^. 

For the purpose of computing the conditional distribution of A 0 , 
the km x km matrix A ^ for any k > 2 as 

’ T e^eT 0 0 ... 0 

0 T e A eT 0 ... 0 


A ( fc ) = 


( 11 ) 

□ 

we dehne 


( 12 ) 


0 ... 0 0 T e 7 ' el 

Jk -1 3 k -1 

0 ... 0 0 0 T 


Proposition 4.5 For an arbitrary k >2, the age distribution of the individ¬ 
ual at the kth observation time, conditional on the successive observed phases 
being j u j 2 ,.. .,j k , is given by 

P[A 0 < s | B a (j[) < 00, B n (j' 2 ) < 00, ..., B jk _ x (j' k ) < 00] = ( 13 ) 

where 

a (/_ e T.)(_T)-i e , i neJ(-T)-'e, i+1 

2=1 

/c Ac—1 

a Y, Bu Wt-T)- 1 ^; e ji(—T) -Ie jf +1 . (14) 

i=2 £=i 

“(-'f)- i = J n 4 (-T)\., (is) 

2=1 

where Bu{s) is defined in (Hill and is given in (fT2]h 


N k (s) = 


D k — 
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Proof. We have 


P [A 0 < s | B a (j[) < oo, B h (j' 2 ) < oo,..., B jk l (j' k ) < oo] 
p [Ao < s , B a (j[) < oo,..., Bj k _ 1 (j' k ) < oo] 
p [ B M) < oo, B h (j' 2 ) < oo,..., Bj k _^j' k ) < oo] 

N k {s) 

Dk 

We shall prove using induction on k that 

EM = EM (,« 

D k D„ ’ [ ’ 

where N k (s) and D k satisfy (11411 and (1 15 . respectively. Recall that ty = ye,,- 
for any absorbing phase V < j' < m!. When k = 2, 


D 2 = P [B a (j[) < oo,B h (j 2 ) < oo] 

= a(-T)- 1 t j( eJ 1 (-T)- 1 ^ (17) 

= i 2 d 2 . 


Further, by (flUl) and by conditioning on the value of the absorption times 
B a {j[) and B h {j ' 2 ), we have 

^ 2 (s) = p [(B a (j[) + Bfrdz)) < s, B a (j[) < oo, B h (j 2 ) < oo] 


' u =0 


a.e Tu t j / 

J 1 


' v=0 


e Y Ji e Tv ty 2 dv du 


7 2 J e T(s u) )(-T) 

7>(I - e Ts )(-T)- 1 e jl eJ i (-T)- 1 e,, - a6 12 ( S )(-T)" 1 ^} (18) 

7 2 iV 2 ( S ), 


where 


Ri 2 (s) = 


f u =0 


e T “e :; -'eJ i e ,:r(s “■’du. 


Using Lemma 14.4] this matrix integral can be evaluated explicitly by defining 
the 2 m x 2m block-structured matrix 


A^ = 


T e fi e l 

0 T 
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so that 


Bi2(s) = (fl 1 ®I m )e AWa (f 2t2 ®I m ). 


Therefore (fTfil) holds for k = 2. 

We now assume that (fTHl) holds for k, and we need to prove that is still 
holds for k + 1. We can decompose the conditional age at the {k + l)st 
observation, A a (j i,... ,jk+i), into the sum of the random variables B a (j[) 
and A 0 (j 2 ,... which are conditionally independent given j 1 . Note that 

A a (j 2 ,... ,jk+i) is now conditional on the phase process starting with initial 
distribution vector rather than ot , and the first observed phase is j 2 rather 
than j \, etc. To avoid confusion, we shall use the notation A a (j 2 ,..., jk+i ) 
(or A a for short), W(s), D^, &u(s) whenever we will be in that situation. 

We use the convolution formula for the sum of the two conditionally inde¬ 
pendent variables and A 0 , together with the conditional distribution 

of given in ((5j) and the induction assumption, to obtain 


P[(B a (j v ) + A 0 ) < s | B a (j[) < oo, B h (j' 2 ) < oo,..., B jk (j' k+1 ) < oo] 


, P { A ° - S ~ U \ B M) <oo,..., B jk (j' k+1 ) < oo] du 


l 



We immediately see that the denominator of the above expression, D^+i '■= 


a(—T) l ej^Dk, corresponds to (fT5l) for k + 1. It remains to show that the 

rmmpratnr A T, ( q! •= f rvp^- u p -i Af, (q — 'lArfai rnrrp>sr>nnrls tn fl 1 1 h fnr /:' — j 




l 


s 



i =2 


a(I - e Ts )(—T) - etB 12 (s)r hk 



Using Lemma [4.41 and fllip . we can show that 
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so that by properly redefining the indices we finally obtain what we need. □ 


Using dim , the expressions for N k (s) and D k can be rewritten as 

, T / > Ts aW? 

Nk{s) = cra k - cte u k - v k e w k , 

D k = cxup. 


(19) 

( 20 ) 


where 


Uk 


v k 


w k 


k -1 


(-T) -1 ^ IK(- T ) _1 e 


i=l 


(fk,l ® «) 


rt ' 

Ji+1 


k -1 


£(/M®«(-Trs ; nd(- T r le >; +1 - 


i =2 


£=i 


( 21 ) 

( 22 ) 


In the expression for w k , an empty product (when i — k) is interpreted as the 
scalar 1. Note that it is also possible to express N(k) and D(k) recursively 
as follows: for k > 3, 


N k (s) 

D k 


[N k -i(s)e jk _ 1 - (/Ji ® «)e A(fc)s (/ fcifc (8) / m )](-T) H fk 
Dk-iej^i-T) tj' k , 


where N 2 (s) and D 2 are given in dlSll and (1171) respectively. 

We now assume that the individual is still alive at the time of the fctli 
observation, but is discovered dead (that is, in phase 0 ) at the time of the 
[k + l)st observation. We are then interested in the conditional lifetime 
distribution of the individual, given the sequence of observed phases. Indeed, 
the lifetime L it is then given by the age at the fctli observation plus the 
time until absorption from the last observed phase j k to phase 0 , conditional 
on this time being less than the time between the fcth and the (k + l)st 
observation. We shall need the following lemma: 


Lemma 4.6 Let X ~ PH(0,T) and Y ~ Exp( 7 ). The conditional distribu¬ 
tion of X, given that X < Y, is given by 


P[X <x\X <Y] 


1 - 7 0( 7 I - T)- 1 ! + 6e^-^ x {j I - T) - 1 T1 
1 — 7 0(yl — T) _1 l 


and the density is given by 

fx\ x<y{x) 


6 > e (T- 7 i) a; (_ T ) 1 
1 — 76 ( 7 ! — T) _1 l 
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Proof. The distribution is obtained by conditioning on the value of Y. The 
expression for the conditional density then follows. □ 

The conditional lifetime distribution is computed in the next proposition. 


Proposition 4.7 The lifetime distribution of an individual, conditional on 
the sequence of observed phases being ji,j 2 , ■ ■ ■ ,jk, 0, is given by 

P[L<s | B a (j[) < oo,..., %(0) < oo] 

= C k {cxu k e] k [e^-^ s - I]( 7 I - T ) _1 + al 12 (s) + v k J lk (s)}Tl, 

(23) 

where u k and v k are given in CB and (J22]) respectively, and 

Ck = [au t (l- 7 eT( 7 I-T)- 1 l)]- 1 , (24) 





Z 12 

(*) 

= ifh 

® I 77 I 

)e B(k)s if 2 , 2 ®Im), 




J\ k{s) 

= [I mk j 

O 

3 

?r 

1 p cW. 

mj c 

[ 0 mxmk'i J-m] 

with 











B (fc) 


T 

U ^l 


C (fc) = 

" A (fc ) w k^] k 




0 

T — 7 I _ 


1 

0 

H 

I 

-3 

hH 

1 _ 

Proof. 

We have 







P [L < 

s 

BM) 

<00,.. 


D< 

OO, B jk (0) < Oo] 


k 

= P B hi- 1 ) O'D) + x < s I B M ) < OO, ..., B jh ( 0) < oo] (25) 

i =2 

where X ~ P11 (o), . T) and X is taken conditionally on X < Y, where Y is 
the interarrival time in the Poisson observation process, which is exponen¬ 
tially distributed with parameter 7. Since the age at the kth observation, 
given by B a (j[) + ’ffn =2 B .i(i-r> C);)> an< ^ the residual life time, X, are condi¬ 
tionally independent given the phase at the kth observation, j k , we can use 
m-m, together with Lemma 14.61 and the convolution formula, to com¬ 
pute the conditional lifetime distribution. In order to simplify the notation 
we define C k as in (1241) . By conditioning on the value of X and using (1191) . 
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we then obtain 


P[L < s | B a {j[) < oo,..., B jk _ 1 (j' k ) < oo, B jk ( 0) < oo] 
f s N k (s-u ) c T e Cr-7D«(_T)i 

Jo At (l- 7 eJ fc (7/-T)- 1 l) “ 

= C*, jcm fc ejjl - e (T_ 7 l)s ](7l - T ) _1 - a J e T ( s ~ u) u k e] k e ( - T ~' rI)u du 

-v k J\ A(k) ^w k ej/ T -^ u du\ (-T)l 
= C k {au k ej k [I - e( T-7l ) s ] ( 7 I - T ) -1 - aX 12 (s) - v fc Ji fc (s)} (-T)l, 
where, by Lemma, [4.41 

Z 12 (s) = Jifc(s) = [I mfc , 0 mA;xm ]e cW ' s [0 mxmfe , I m ] T 


with 


B ( fc ) 


T 

0 


u * e J fc 

T-7I 


c (fc) 


A (fc) w k e] k 

0 T — 7 I 


□ 


5 The uniform observation scheme 


In this section, we assume that an observer samples an individual in a popu¬ 
lation at a single random time T a in accordance with a uniform distribution 
on [ 0 ,f], for some time t > 0 , where the clock is set at the birth of the 
individual. We then ask the same questions as in the Poisson observation 
scheme, but we expect different answers. The three random variables of in¬ 
terest A a ,Yj, and Z 3 are illustrated in Figure [21 Their respective conditional 
distribution, under the uniform observation scheme, is provided in the next 
three propositions. 


Proposition 5.1 The conditional age distribution of the individual at a ran¬ 
dom observation time uniformly distributed on [ 0 ,i], given ip 0 = j, is given 
by 


P[A 0 < s | ip 0 = j] 


a[exp(Qs) — exp(Qf)] (—Q) 1 e j 
a [I — exp(Qf)](—Q) _1 ej 


for s <t, (26) 


and P[A a < s \ (p 0 — j] — 1 f or s > t. 


17 








0 t 



Figure 2: A Possible trajectory of the phase process until absorption. The 
observation time is uniform on [0,t] and is represented by a ▼ symbol. The 
observed phase is j = 3. 


Proof. By conditioning on the value of the observation time T a , we have 
P[^o = j] = (1/0 [ ae Q “e,- du = (1 /t) a [I - exp(Qf)](-Q) _1 ej, (27) 

Jo 

and 


P [A 0 < s,<p 0 = j] = (1/f) / oce Qu ejt{ u < s }du 

Jo 

pmin (s,t) 


= (VO 


cte^ejdu 


= (1/0 a[I — exp(Qmin(s, 0)] (—Q) 1 e J -. 

The conditional distribution (l26j) then follows by dividing (l28li by (l27lh 


(28) 

□ 


Proposition 5.2 The conditional distribution of Yj, given (f a 
point mass at zero given by 


P[Yj = 0 | <p Q = j\ 


and for 0 < y < t, 


aj( 1 — e Xjt ) 

A j a [I — exp(Q0](—Q) _1 e/ 


j, has a 


(29) 


P[Yj <y\po = j} 


A jdtle^y — e Qt ](— Q) 1 e J - + a.e ciy ej(e Xj J ^ — 1 ) 
A j a [I - exp(Q0](—Q) _1 e,- 


,(30) 
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where A j = —Qjj. Finally, P[Yj <y\tp 0 = j] = l for y > t. 

Proof. The proof follows exactly the same lines as in the Poisson observation 
case. For any 0 < y < t, 

P[Yj e [; y,y + dy ], (p 0 = j] 

= (1 /t) / P [Yj e [y,y + dy ], p(u) = j\T a e [u,u + du ]] du 
Jo 

= (1 /t) [ Y,( aeQu )kQk j dye- x ^- y Uu 

y k^j 

= (l/t)ae Qy (Q + Xfi)ejdy f e~ x ^ u ~ y) du 

Jy 

„ 1 - e ~ Xj d-y) 

= (l/t)a.e cly (Q + \jl)ejdy ---. (31) 

A? 

Similarly, 

P w adl - e~ Xj d~y)) 

PK = 0. (ft, = j] = (I/O / du = (1/t) 

jy 

which, together with (071) . leads to OH . Finally, from OH . Oil) , and 071) 
we obtain (1301) . □ 


Proposition 5.3 The conditional distribution of Zj, given ip 0 = j, is given 
by 


PIZ <z\v -i 1-1 a|1 - eQ( ‘"' >1 tSLY Lp for - <t ( 32 , 
P[Z,<z\ip 0 -]\-\ a [I-f3*](-Q) i C; '° r (32) 

and P[Zj < z \ (p Q — j] — 1 for z > t. 

Proof. By the usual arguments, 

P [Zj E [z,z + dz \, (p 0 = j ] 

~\nZ 


= m 


a j e 


+ 


[ J2( aeQ(U ~ Z) ^Q^ e ~ XjZ du 

Jz k^j 


dz, 


where the first term in the bracket accounts for the case where “T 0 = z'\ that 
is, the individual is observed in her/his initial phase. We then have 

P [Zj E [z,z + dz], tp 0 = j} 


= m 


a^e XjZ + a 


e Q{u ~ z) du (Q + A I )ej(. x - z 


dz 


(1/t) [aije~ XjZ + a [I - e Q ( t "*)](-Q)- 1 (Q + A, I) e^*] dz , (33) 
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and the conditional density function of Zj, given ip a = j, is obtained by 
dividing (1331) by (1771) and rearranging the terms in the numerator: 


fZj\p 0 =j{z) — 


ck [Xje-^ z - e Q* e -( Q+A ^(Q + AjI)] (-Q )’ 1 e,- 


a [I — e Qt ](—Q) 1 ej 
Then, as P[Zj < z \ (p 0 = j] = f 0 ~ fz j \tp 0 =j( u ) du, we obtain ()32]h 


□ 


6 Rare observation limit 


In practice, individuals of an animal population are usually observed very 
seldom. This is particularly true for endangered wild populations such as 
the Chatham Island black robins Petroica traversi , which are observed once 
or twice per year (per individual) on average. We are therefore interested 
in the limit of the conditional age distribution as 7 —> 0 in the Poisson 
observation scheme, or as t —> 00 in the uniform observation scheme. First, 
observe that 

T ^ Q, and e Qt 0. 

An interesting consequence of Propositions 14. II and 15.II is that the rare obser¬ 
vation limit of the age distribution is identical for the Poisson and uniform 
observation schemes. In addition, the limiting age distribution corresponds 
to the age distribution Fj(s) that we derived in Lemma Id.II under the as¬ 
sumption that the birth process is Poisson. 


Corollary 6.1 The rare observation limit of the conditional age distribution, 
given ip Q = j, is given by 


P[A 0 < s | Lp 0 = j] 


aexp(Qs) (—Q) 1 e j 
a(-Q)- 1 e j 


*;■(*)■ 


(34) 


□ 


Actually, not only is the rare observation limit of the age distribution 
identical for the Poisson and the uniform observation schemes, but this holds 
for the conditional distribution of Yj and Zj too, as shown in the next two 
corollaries. These results are direct consequences of Propositions I4~T1 and 1771 
and Propositions 14.31 and 15.31 respectively. 
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Corollary 6.2 The rare observation limit of the conditional distribution of 
Yj, given ip Q = j, has a point mass at zero and is given by 



(35) 


and for y > 0 


P[Yj <y\g>o = j] 


ae Q!/ (- Q) X (Q + A j I )ej 
A? a {— Q) le j 


(36) 


where A j = —Qjj- 


□ 


Corollary 6.3 The rare observation limit of the conditional distribution of 


Zj, given ip D = j, is exponential with parameter A j. 


□ 


7 Numerical illustrations 

We illustrate the results of the previous sections on a toy example first, and 
then on the real-world example of the Chatham Island black robin Petroica 
traversi population. 

7.1 Toy example with five phases 

We consider a PH(ck, Q) lifetime distribution with m — 5 transient phases 
and transition rate matrix 


-3 2 

-5 3 

Q= 1 -4 2 

1 -6 3 

1 -2 


where a blank space represents a zero entry. We shall consider two initial 
distribution vectors: 


c* (1) = [1, 0, 0, 0, 0], and a (2) = [1/5,1/5,1/5,1/5,1/5]. 


In the first case, the process starts in phase 1 almost surely, while in the 
second case, the initial phase is chosen uniformly on the transient phase 
space. As we show below, the initial distribution vector can affect the shape 
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of the various conditional distributions. We choose to represent densities 
(rather than distribution functions) as they better capture the features of 
the distributions. 

We first assume a single observation, and we condition on the observed 
phase being j = 4. Figure [3] shows the conditional age densities obtained 
under the different observation schemes, for different values of the parameters 
7 and t, as well as the rare observation limit. Observe the discontinuity of 
the density at s — t in the uniform case (while the distribution function given 
in (j26]) is continuous at s = t). This comes from the fact that an individual 
observed in the time window [ 0 ,t] cannot be older than t , and suggests that 
the Poisson observation scheme is more natural than the uniform observation 
scheme on a finite time interval. We see that for q^, the mode of the 
distribution is positive and tends to increase as the observation becomes 
rare, while for cd 2 ), the mode is clearly at age zero. 

Figures 0] and 0 illustrate the conditional densities of Y 4 and Z 4 respec¬ 
tively. The initial distribution affects the shape of the density of Y 4 , which 
has a point mass at zero for cd 2 ), but has negligible effect on the shape of the 
density of Z 4 . Also note that in the uniform observation case, the density 
of Yj is continuous at .s — t, but similar to the age density in Figure [3l the 
density of Zj is discontinuous at s = t. 

Finally, we consider the multiple Poisson observation scheme with k = 5 
observations, for different values of the parameter 7 , and different sequences 
of observed phases: 1 , 2 , 3, 3,4 (Sequence 1) and 2,3,1,1,4 (Sequence 2 , 
which is less likely than Sequence 1). In Figure [ 6 ] we compare the age dis¬ 
tribution at the 5th observation, and the lifetime distribution given that the 
individual is dead at the 6 th observation, for cr 1 ' (there is not much dif¬ 
ference for a^ 2 )). The graph illustrates how a change in the sequence of 
observed phases affects the related conditional distributions. We see that 
the tail of the distributions corresponding to Sequence 2 is fatter than for 
Sequence 1, that is, an individual is more likely to be older at the time of 
the last observation when Sequence 2 is observed. 

7.2 How old are the Chatham Island black robins Petroica 
traversil 

In this last section, we come back to our original objective, and illustrate 
the usefulness of our results to compute the age pyramid for the black robin 
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Figure 3: Conditional age density at the observation time, given that phase 
j = 4 is observed, for the two observation schemes with different parameter 
values, and the rare observation limit, as a = [1, 0, 0, 0, 0] (top) and a = 
[1/5,1/5,1/5,1/5,1/5] (bottom). 


population during the intensive management period between 1980 and 1989. 

A first step of the analysis consists in modelling the bird population using 
a branching process called Markovian binary tree (MBT), which is done in 
detail in a parallel stud}@. Age-specific mortality and fertility rates of the 
black robins can be estimated from the unique dataset collected between 1980 
and 1989 [2]. These age-specific rates are used to estimate the parameters of 
an MBT that optimally fits the data, and this model is then used to study 
demographic properties of the population during the intensive management 

2 S. Hautphenne, M. Massaro, E. S. Kennedy, and R. Sainudiin. Modelling of the 
Chatham Island black robin Petroica traversi populations using branching processes: In¬ 
formed management strategies for reintroduction of endangered species. In preparation 
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Figure 4: Conditional density of Y), given that phase j = 4 is observed, for 
the two observation schemes with different parameter values, and the rare 
observation limit, as a. = [1,0, 0,0,0] (top) and a. = [1/5,1/5,1/5,1/5,1/5] 
(bottom). The point mass at zero is clear when a. = [1/5,1/5,1/5,1/5,1/5]. 


period. 

In the present section we shall focus on the bird lifetime distribution 
rather than on their reproduction process. The estimated female age-specific 
mortality rates for the period 1980-1989 are shown in Table [0 The age 
class [0,1) corresponds to birds who fledged. Note that the lack of data, 
in particular for the ages above 5, lead to inacurrate estimations for these 
age-classes. Indeed, five females reached age 5 between 1980 and 1989, but 
only one reached age 8, and only one reached age 12 during that period. A 
Bayesian approach was used to bias low relative frequencies upward. 

We assume that the lifetime L of a female black robin has a PH (ck, 0) 
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Figure 5: Conditional density of Zj, given that phase j = 4 is observed, for 
the two observation schemes with different parameter values, and the rare 
observation limit, as a = [1, 0, 0, 0, 0] (top) and a. = [1/5,1/5,1/5,1/5,1/5] 
(bottom). 


distribution with m = 13 (transient) phases with the specific ageing structure 


1 2 3 ... m 

12 ... m 

a = [ 1 0 ... 0 ], 

that is, an individual starts its life in phase 1 and moves through successive 
phases until it dies; in this case, a transition from a transient phase j can 
only be to the next phase j + 1 (with probability Sj), or to the absorbing 
phase 0 (with probabilty 1 — Sj). This particular PH distribution has a 


Q = 


A\ AiSi 

—A2 A2S2 


; (37) 
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—7=5, seq. 1, Age 
■—7 = 5, seq. 1, Lifetime 
--■7=5, seq. 2, Age 
•--■7=5, seq. 2, Lifetime 

.7=2, seq. 1, Age 

■.7 = 2, seq. 1, Lifetime " 

■--7=2, seq. 2, Age 
■--7= 2, seq. 2, Lifetime 


Figure 6: Conditional age/lifetime density at the time of the last Pois¬ 
son observation when the observed sequence is 1,2, 3, 3,4 (Sequence 1) and 
2, 3,1,1,4 (Sequence 2), with a = [1, 0, 0, 0, 0]. 

minimal number of parameters (2 m — 1), which makes it more convenient 
for parameter estimation. The rates A* and probabilities s t were estimated 
following the approach in [7], by minimizing the sum of weighted squared 
errors 

12 

F = ]T(4 - d{x)) 2 S x , 

x=0 

where d x is the observed (estimated) mortality rate in age class [x,x + 1) 
as shown in Table [lj S x = (1 — do)(l — d\) •••(! — d x -±) is the observed 
probability of survival until age class [x, x + 1), and d(x) is the corresponding 
model value for d x , which is computed using ([I]): 

d(x) = P[x < L < x + 1\L > x\ 

P [L > x] — P [L > x + 1] 

P [L > x] 

ae Qx(j _ 

ae^l 

The resulting optimal rates A $ and probabilities s, are provided in Table [2j 
Figure [3 shows the estimated age-specific mortality rates d x (stars) together 
with the mortality function d(x) corresponding to the optimal model (plain 
line). We see that the model mortality curve smoothes the inaccurate point 
estimates in a satisfactory way. 
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Age class [x, x + 1) Mortality rate d x 


[0,1) 

0.19 

[1,2) 

0.23 

[2,3) 

0.36 

[3,4) 

0.27 

[4,5) 

0.40 

[5,6) 

0.50 

[6,7) 

0.33 

[7,8) 

0.33 

[8,9) 

0.67 

[9,10) 

0.33 

[10,11) 

0.33 

[11,12) 

0.33 

[12,oo) 

0.67 


Table 1: Female age-specific mortality rates estimated from the raw data 
on the Chatham Island black robin population during the managed phase 
1980-1989. 


Phase i 

A* 


1 

1.54 

0.93 

2 

1.59 

1 

3 

1.26 

0.27 

4 

1.53 

1 

5 

1.67 

1 

6 

2.21 

0.98 

7 

1.86 

1 

8 

1.36 

1 

9 

1.28 

0.06 

10 

0.68 

0.98 

11 

0.76 

0.91 

12 

1.86 

0.55 

13 

1.15 

— 


Table 2: The parameters of the PH distribution modelling the Chatham 
Island black robin lifetime. 
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Figure 7: Estimated age-specific mortality rates (stars) and the age-specific 
mortality curve computed with the optimal Markovian model fitting the data 
(plain line). 


Among other useful properties, the MBT model allows us to compute 
the asymptotic phase frequency in the population, that is, the proportion of 
birds in each of the 13 phases if we let the population evolve for a long period 
of time with the same demographic rates. We denote by /J the asymptotic 
frequency of phase j, and we show the value of f 1 ’ for 1 < j <13 in Figured 
Since the phases do not have any physical interpretation, the asymptotic 
phase frequency does not have much biological interest in its own. However, 
it can be used in combinaison with the results developed in this paper to 
compute the asymptotic age-frequency (also called the age-pyramid), which 
cannot be obtained directly from the MBT model. 

We consider the following eleven age-classes: [0,1), [1, 2),...,[9,10), [10, oo), 
and we denote the asymptotic frequency of the age-class starting at age x as 
/“, 0 < x < 10. We can then compute /“ as 

fx=Yl /f P t age e i x ’ x + !) I P hase = j]- 

1<?<13 

We approximate the probability P[age E [x, x + 1) | phase = j] using the rare 
limit conditional age distribution provided in Corollary 16.11 

The resulting age pyramid for the black robins is depicted in Figure [UJ 
From the shape of the pyramid, we see that the population is rapidly ex¬ 
panding, which was indeed the case during the period of intensive conserva¬ 
tion management in 1980-1989. However, during this period, reproductive 
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Figure 8: The asymptotic phase frequency in the Chatham Island black robin 
population if the population evolves with the demographic rates of 1980-1989. 


outputs were artificially increased through human intervention. By cross- 
fostering black robin offspring to the closely related Chatham Island tomtit 
Petroica macrocephala chathamensis, female black robins were induced to lay 
additional clutches of eggs [8]. This artificially increased reproductive success 
in combination with our assumption that the fertility and mortality rates are 
fixed over a long time period, results in an age pyramid whose shape may 
not be representative of the current population. 

Given that the current black robin population is restricted to only two 
small islands and includes fewer than 250 individuals, the species remains 
endangered |9j. Hence, knowing the age frequency of this population, and 
the associated fertility and mortality rates, is highly relevant to the future 
conservation management of this species; this is investigated in more detail 
in the parallel stud}§|. 

Remark 7.1 In the particular case of an ageing process with structure fzD, 
and, conditionally on phase j being observed, any trajectory of the phase pro¬ 
cess before observation is restricted to the phases 1,2,... ,j. Therefore, for 
j < m, it is sufficient to consider the process restricted to the smaller phase 

3 S. Hautphenne, M. Massaro, E. S. Kennedy and R. Sainudiin. Modelling of the 
Chatham Island black robin Petroica traversi populations using branching processes: In¬ 
formed management strategies for reintroduction of endangered species. In preparation 
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Figure 9: The asymptotic age frequency in the Chatham Island black robin 
population if the population evolves with the demographic rates of 1980-1989. 


space { 1 , 2 ,... ,j}, with initial distribution vector and generator 


a 


U) - 


= [1 0 ... 0 ], Q ij) = 


1 2 

—Ai AiSi 

—A 2 


A 2 s 2 


We further define the matrix = Q (j> — 7 / where the identity matrix is 
jxj, and ef as the truncation of e* after its jth entry, fori < j. The matrix 
ingredients ol,Q,T, and ej used in the lemmas, propositions, and corollaries 
in the previous sections can then be replaced by their smaller counterpart ct^\ 
QV\ T®, and ef. 

Finally, observe that with the particular ageing structure (EZD, in the 
Poisson observation scheme, the random variable Yj has the same distribution 
as the time until the process with generator reaches phase j for the first 
time, B(j), conditionally on B(j ) < 00 . 
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